Switch-LSTMs for Multi-Criteria Chinese Word Segmentation
Similar resources
Adversarial Multi-Criteria Learning for Chinese Word Segmentation
Different linguistic perspectives cause many diverse segmentation criteria for Chinese word segmentation (CWS). Most existing methods focus on improving the performance for each single criterion. However, it is interesting to exploit these different criteria and mine their common underlying knowledge. In this paper, we propose adversarial multi-criteria learning for CWS by integrating shared kn...
Multi-Grained Chinese Word Segmentation
Traditionally, word segmentation (WS) adopts the single-granularity formalism, where a sentence corresponds to a single word sequence. However, Sproat et al. (1996) show that the inter-native-speaker consistency ratio over Chinese word boundaries is only 76%, indicating that single-grained WS (SWS) imposes unnecessary challenges on both manual annotation and statistical modeling. Moreover, WS results...
Effective Neural Solution for Multi-Criteria Word Segmentation
We present a simple yet elegant solution to train a single joint model on multi-criteria corpora for Chinese Word Segmentation (CWS). Our novel design requires no private layers in the model architecture; instead, it introduces two artificial tokens at the beginning and end of the input sentence to specify the required target criteria. The rest of the model including Long Short-Term Memory (LSTM) layer ...
Chinese Word Segmentation for Agriculture
A new algorithm based on a hash mechanism is presented; it supports search, update, deletion, and addition operations on the dictionary. Exploiting the characteristics of the GB code of Chinese characters, the method preserves the GB code of the first word in each entry, which effectively improves the utilization rate of the storage space. In the dictionary, the one-to-many corresponding relation...
Word Segmentation for Chinese Novels
Word segmentation is a necessary first step for automatic syntactic analysis of Chinese text. Chinese segmentation is highly accurate on news data, but the accuracies drop significantly on other domains, such as science and literature. For scientific domains, a significant portion of out-of-vocabulary words are domain-specific terms, and therefore lexicons can be used to improve segmentation si...
Journal
Journal title: Proceedings of the AAAI Conference on Artificial Intelligence
Year: 2019
ISSN: 2374-3468,2159-5399
DOI: 10.1609/aaai.v33i01.33016457